Skip to content

Conversation

@yehudit1987
Copy link
Contributor

Add comprehensive test suite for embedding model infrastructure covering
concurrency, memory, and performance. Includes 29 integration tests,
GitHub Actions workflow, and complete documentation.

  • Concurrency tests: race conditions, deadlocks, goroutine leaks
  • Memory tests: leak detection, growth analysis, GC behavior
  • Performance tests: throughput, latency, concurrent performance
  • CI workflow with model downloads and automated testing
  • Makefile targets: test-embedding, bench-embedding
  • Comprehensive test documentation and helper utilities

Resolve issue: #715

@netlify
Copy link

netlify bot commented Nov 24, 2025

Deploy Preview for vllm-semantic-router ready!

Name Link
🔨 Latest commit db0a6dd
🔍 Latest deploy log https://app.netlify.com/projects/vllm-semantic-router/deploys/69256c1a3cd79e0008e31600
😎 Deploy Preview https://deploy-preview-729--vllm-semantic-router.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

@yehudit1987 yehudit1987 force-pushed the embedding_tests branch 2 times, most recently from fe1d06d to fd964dd Compare November 24, 2025 17:47
@github-actions
Copy link

github-actions bot commented Nov 24, 2025

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 Root Directory

Owners: @rootfs, @Xunzhuo
Files changed:

  • .github/workflows/integration-tests-embedding.yml

📁 src

Owners: @rootfs, @Xunzhuo, @wangchen615
Files changed:

  • src/semantic-router/test/integration/README.md
  • src/semantic-router/test/integration/embedding_concurrency_test.go
  • src/semantic-router/test/integration/embedding_memory_test.go
  • src/semantic-router/test/integration/embedding_performance_test.go
  • src/semantic-router/test/integration/embedding_test_helpers.go

📁 tools

Owners: @yuluo-yx, @rootfs, @Xunzhuo
Files changed:

  • tools/make/build-run-test.mk

vLLM

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

@yehudit1987 yehudit1987 force-pushed the embedding_tests branch 2 times, most recently from dd9cf1e to 0186df2 Compare November 24, 2025 18:17
@rootfs
Copy link
Collaborator

rootfs commented Nov 24, 2025

@OneZero-Y PTAL, thanks

Signed-off-by: Yehudit Kerido <[email protected]>
@yehudit1987 yehudit1987 marked this pull request as ready for review November 25, 2025 10:22
@yehudit1987
Copy link
Contributor Author

Optional: HF_TOKEN Configuration

This PR's tests support graceful fallback and pass without HF_TOKEN (Qwen3-only mode).

To enable full Gemma model testing, maintainers can add HF_TOKEN as a repository secret after accepting the Gemma model license.

This enables the BenchmarkEmbedding_ModelComparison benchmark and validates both model paths.

Copy link
Member

@Xunzhuo Xunzhuo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like there is a gap between the tests and target goal, initially, we want to add testcases under integration test to verify the intelligentroute with signal embeddings

https://vllm-semantic-router.com/docs/api/crd-reference#embeddingsignal

to make sure signal embeddings works, this should be added to https://github.com/vllm-project/semantic-router/tree/main/e2e/testcases

@yehudit1987
Copy link
Contributor Author

yehudit1987 commented Nov 25, 2025

looks like there is a gap between the tests and target goal, initially, we want to add testcases under integration test to verify the intelligentroute with signal embeddings

https://vllm-semantic-router.com/docs/api/crd-reference#embeddingsignal

to make sure signal embeddings works, this should be added to https://github.com/vllm-project/semantic-router/tree/main/e2e/testcases

Thanks for clarifying! I want to explain how I interpreted #715, so you might see how I saw things and it will explain the gap.

Why I Built Infrastructure Tests

The issue explicitly states:

"This issue focuses on adding integration tests specifically for embedding-based signals, separate from the E2E tests"

Under "General Embedding Signal Tests" → "Integration Tests", it lists:

  • ✅ Test embedding model loading
  • ✅ Test embedding model performance
  • ✅ Test embedding model memory usage
  • ✅ Test concurrent embedding requests

I delivered exactly these tests as they were the most explicit and actionable requirements.

Where I Got Confused

The "Acceptance Criteria" mentions PII detection/domain classification but didn't specify:

  • These should be in e2e/testcases/
  • They should test IntelligentRoute CRD with EmbeddingSignal
  • How "integration tests" (mentioned at top) related to "E2E tests"

I interpreted "separate from E2E tests" as "don't add to e2e/testcases/" - now I understand that was incorrect.

Questions

  1. Do the current embedding infrastructure tests have value? Should I keep or remove them?

  2. Preferred path forward:

    • Option A: Keep this PR (infrastructure), create separate PR for signal routing E2E tests
    • Option B: Extend this PR to include both

I'm happy to add the EmbeddingSignal E2E test cases (by the way some are already exist need to add more) - just need to know if the current work should be kept or discarded.

Looking forward to your guidance!

@rootfs
Copy link
Collaborator

rootfs commented Nov 25, 2025

@yehudit1987 instead of creating a new workflow for these tests, can you use the existing test and build workflow? The tests can also be added to the candle-binding directory instead of semantic-router, since these tests don't exercise the router components. Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants